Exposure database#
This notebook provides a systematic approach for analyzing and visualizing infrastructure networks across different European countries. It iterates through a list of countries and network types, retrieves the relevant geographic data, processes it to handle missing values, and generates visualizations to provide insights into the infrastructure distribution and characteristics within each country. Raw geospatial data is sourced from multiple databases, including OSM. Integration with OSM data is achieved using the OSMnx Python library, which simplifies downloading, modeling, analyzing, and visualizing OSM data. The integrated data is associated with EU codes representing all European countries (e.g., FR for France, NL for Netherlands). Geometrical representations, such as lines for roads, points for schools or substations, and polygons for areas like educational districts or power grids, are processed. Finally, the output is stored in vector Geoparquet files, named according to the network code, geometry type, and EU code (e.g., Network_code_geometry_EU-Code.parquet).
# Path to the tmp_cache directory
cache_dir = os.path.join(os.getcwd(), "tmp_cache")
# Create the directory if it doesn't exist
os.makedirs(cache_dir, exist_ok=True)
# Example of saving a file in the cache directory
with open(os.path.join(cache_dir, "cache_file.txt"), "w") as file:
file.write("This is a cached file.")
# Define the folder where the database will be stored
# TODO: We could set a .miraca folder as a default for all packages
Miraca_Exposure_Database_path = Path("~/miraca_exposure_database").expanduser()
This dictionary maps the names of European countries to their respective ISO 3166-1 alpha-2 codes. These codes are two-letter country codes defined in the ISO 3166-1 standard.The dictionary keys are country names (str) and values are their ISO 3166-1 alpha-2 codes (str).
EU_code = {
"Albania": "AL",
"Austria": "AT",
"Belgium": "BE",
"Bulgaria": "BG",
"Croatia": "HR",
"Cyprus": "CY",
"Czechia": "CZ",
"Denmark": "DK",
"Estonia": "EE",
"Finland": "FI",
"France": "FR",
"Germany": "DE",
"Greece": "GR",
"Hungary": "HU",
"Iceland": "IS",
"Ireland": "IE",
"Italy": "IT",
"Latvia": "LV",
"Liechtenstein": "LI",
"Lithuania": "LT",
"Luxembourg": "LU",
"Malta": "MT",
"Montenegro": "ME",
"Netherlands": "NL",
"North_Macedonia": "MK",
"Norway": "NO",
"Poland": "PL",
"Portugal": "PT",
"Romania": "RO",
"Serbia": "RS",
"Slovakia": "SK",
"Slovenia": "SI",
"Spain": "ES",
"Sweden": "SE",
"Switzerland": "CH",
"United_Kingdom": "UK",
"Svalbard_and_Jan_Mayen": "SJ",
}
This dictionary maps the names of European countries to the cost from GEM. The dictionary keys are country names (str) and values are the GEM costs (float).
Gem_cost = {
"Austria": 1454.26,
"Belgium": 1349.35,
"Bulgaria": 530.44,
"Cyprus": 1221.97,
"Czech_Republic": 1167.68,
"Denmark": 2629.95,
"Estonia": 953.19,
"Finland": 2005.17,
"France": 2225.00,
"Germany": 1869.00,
"Greece": 954.97,
"Hungary": 534.89,
"Ireland": 1640.27,
"Italy": 1424.00,
"Latvia": 937.17,
"Lithuania": 962.09,
"Luxembourg": 1418.66,
"Malta": 931.83,
"Netherlands": 1401.43,
"Poland": 1440.02,
"Portugal": 801.00,
"Romania": 775.19,
"Slovakia": 849.95,
"Slovenia": 1110.72,
"Spain": 747.60,
"Sweden": 3528.85,
}
def map_network_to_code(network):
"""
This function maps network names to shorter codes according to a predefined mapping.
Arguments:
network (str): The network name to be mapped.
Returns:
str: The corresponding code for the network, or the original network if not found in the mapping.
"""
network_mapping = [
("Education", "Edu"),
("Healthcare", "Health"),
("Airports", "Airport"),
("Ports", "Port"),
("Roadway", "Road"),
("Railway", "Rail"),
("Telecommunication", "Telecom"),
("Electricity", "Elec"),
]
for network_name, network_code in network_mapping:
if network == network_name:
return network_code
return network
def get_network_tags(network):
"""
This function takes an network type and returns the corresponding OpenStreetMap tags.
Parameters:
network (str): The type of network for which OSM tags are required.
Returns:
dict: A dictionary containing OSM tags associated with the specified network type.
Returns an empty dictionary if the network type is not found.
"""
network_tags = {
"Education": {"amenity": "school"},
"Healthcare": {"amenity": "hospital"},
"Airports": {"aeroway": "aerodrome"},
"Ports": {"waterway": "harbour", "landuse": "harbour", "man_made": "pier"},
"Roadway": {
"highway": [
"motorway",
"primary",
"secondary",
"tertiary",
"motorway_link",
"primary_link",
"secondary_link",
"tertiary_link",
]
},
"Railway": {"railway": ["rail", "station"]},
"Telecommunication": {"man_made": ["communications_tower", "mast"]},
"Electricity": {
"power": [
"cable",
"plant",
"pole",
"substation",
"tower",
"line",
"minor_line",
]
},
}
return network_tags.get(network)
def get_network_columns(network):
"""
This function returns the list of columns associated with a particular network type.
Parameters:
network (str): The type of network for which column names are required.
Returns:
list: A list of column names associated with the specified network type.
"""
network_columns = {
"Education": [
"name",
"geometry",
"amenity",
"building:levels",
"material",
"height",
],
"Healthcare": [
"name",
"geometry",
"amenity",
"emergency",
"building:levels",
"height",
"beds",
],
"Airports": [
"name",
"geometry",
"aeroway",
"ref",
"int_name",
"ele",
"operator",
"surface",
"iata",
"icao",
"passengers",
"landuse",
"layer",
],
"Ports": [
"name",
"geometry",
"waterway",
"man_made",
"landuse",
"ele",
"surface",
"tracktype",
"material",
"incline",
"layer",
"smoothness",
"bridge",
"oneway",
"maxspeed",
"maxweight",
],
"Roadway": [
"geometry",
"highway",
"name",
"int_name",
"int_ref",
"maxspeed",
"lanes",
"oneway",
"bridge",
"sidewalk",
"surface",
"smoothness",
"maxheight",
"tunnel",
"layer",
],
"Railway": [
"name",
"geometry",
"maxspeed",
"voltage",
"bridge",
"layer",
"tunnel",
"cutting",
"electrified",
"usage",
"embankment",
"frequency",
"tracks",
"gauge",
],
"Telecommunication": [
"geometry",
"man_made",
"tower_type",
"mast_type",
"height",
"ele",
"material",
],
"Electricity": [
"name",
"geometry",
"power",
"cables",
"voltage",
"location",
"frequency",
"type",
"wires",
"layer",
"disused",
"operator",
"circuits",
"usage",
],
}
return network_columns.get(network)
def replace_nan_values(gdf, column_name, value):
"""
This function replaces NaN values in a specified column of a GeoDataFrame.
Parameters:
gdf (GeoDataFrame): The GeoDataFrame to process.
column_name (str): The name of the column in which to replace NaN values.
value (str): The value to use as a replacement.
Returns:
GeoDataFrame: The processed GeoDataFrame.
"""
gdf[column_name] = gdf[column_name].fillna(value)
return gdf
Select, download and save geospatial data using osmnx#
def select_data_osmnx(country, network, database):
"""
Fetches and processes geospatial data for a specific network type in a given country using OSMnx.
Parameters:
country (str): The name of the country for which data is being fetched.
network (str): The type of network for which data is being fetched.
database (str): The name of the database.
Returns:
GeoDataFrame or None: A GeoDataFrame containing the processed data if successful,
None if there was an error fetching the data.
Raises:
Exception: Re-raises any unexpected exceptions encountered during data retrieval.
"""
try:
# Get the abbreviation for the country
country_abbr = EU_code.get(country, country)
# Get the tags for the given network type
tags = get_network_tags(network)
if tags is None:
print(f"No tags defined for network type '{network}'")
return
network_code = map_network_to_code(network)
features = ox.features_from_place(country, tags=tags)
# Define a dictionary that maps each network to the preferred geometry types
network_geometry = {
"Education": ["Point", "Polygon"],
"Healthcare": ["Point", "Polygon"],
"Airports": ["Point", "Polygon"],
"Ports": ["Point", "Polygon"],
"Roadway": ["LineString"],
'Railway': ['LineString', 'Point'],
"Telecommunication": ["Point"],
"Electricity": ["Point", "LineString"],
}
preferred_geometries = network_geometry.get(network)
if preferred_geometries is None:
print(f"No preferred geometries defined for network type '{network}'")
return
# Initialize gdf as an empty GeoDataFrame
gdf = gpd.GeoDataFrame()
# Initialize an empty list to hold each GeoDataFrame for different geometries
gdf_list = []
# Define colors for each geometry type
geometry_colors = {
"Point": "#ED5861", # Red for points
"LineString": "#4069F6", # Black for lines
"Polygon": "#4069F6", # Green for polygons
}
# Keep only the features with the preferred geometries
for geom_type in preferred_geometries:
geom_features = features[features["geometry"].geom_type == geom_type]
geom_features = geom_features.dropna(subset=["geometry"])
if "geometry" not in geom_features.columns:
geom_features["geometry"] = None
columns = get_network_columns(network)
# Explicitly create GeoDataFrame
gdf = gpd.GeoDataFrame(geom_features, columns=columns)
# Add columns for country name and abbreviation
gdf["Country"] = country
gdf["Country_code"] = country_abbr
gdf["Geometry"] = geom_type
# Set the CRS for the GeoDataFrame
gdf.crs = "EPSG:4326"
if network in ["Roadway", "Railway"]:
# Call the function to replace NaN values
gdf = replace_nan_values(gdf, "bridge", "no")
gdf = replace_nan_values(gdf, "tunnel", "no")
gdf = replace_nan_values(gdf, "layer", 0)
# Convert 'layer' column to numeric, handling errors by setting them to NaN
gdf["layer"] = pd.to_numeric(gdf["layer"], errors="coerce")
if network == "Education" and country in Gem_cost:
gdf["Cost_per_area"] = "{:.2f}".format(Gem_cost.get(country, np.nan))
# Save to Geoparquet
output_directory = os.path.join(
Miraca_Exposure_Database_path, network, database, country_abbr
)
os.makedirs(output_directory, exist_ok=True)
geom_type_to_letter = {
"Point": "P",
"LineString": "L",
"Polygon": "Α",
# Add more geometry types as needed
}
# Get the specific letter for the current geometry type
geom_type_letter = geom_type_to_letter.get(geom_type, "")
output_filename = os.path.join(
output_directory,
f"{network_code}{geom_type_letter}_{country_abbr}.parquet",
)
gdf.to_parquet(output_filename, engine="pyarrow", compression="snappy")
# Append the current GeoDataFrame to the list
gdf_list.append(gdf)
# Combine all GeoDataFrames into a single one
final_gdf = gpd.GeoDataFrame(pd.concat(gdf_list, ignore_index=True))
# Color the features based on their geometry type
# Add a 'color' column to the GeoDataFrame
final_gdf['color'] = final_gdf['Geometry'].map(geometry_colors)
# Display the final GeoDataFrame with explore() (interactive map)
display(final_gdf.explore(
color=final_gdf['color'],
marker_kwds={
"radius": 4, # Point size
"fillOpacity": 0, # Fill opacity for points
"opacity": 1 # Stroke opacity for points
},
style_kwds={
"weight": 3, # Line width
"opacity": 1 # Line opacity (1 = fully opaque)
}
))
return gdf
except Exception as e:
error_message = str(e)
if "No data elements in server response" in error_message:
print(f"Error fetching data for {network}: {error_message}")
return None
else:
raise
Working with street network using osmnx#
def get_street_network(country, network, database):
"""
Download, model, and save a street network for a specified country and network type.
Parameters:
country (str): The name of the country for which the street network is to be modeled.
network (str): The type of network for which the street network is to be modeled.
database (str): The name of the database where the output files will be saved.
"""
# Get the abbreviation for the country
country_abbr = EU_code.get(country, country)
# Get the tags for the given network type
tags = get_network_tags(network)
if tags is None:
print(f"No tags defined for network type '{network}'")
return
network_code = map_network_to_code(network)
# Define the types of roadways you want to include
road_types = [
"motorway",
"primary",
"secondary",
"tertiary",
"trunk",
"motorway_link",
"primary_link",
"secondary_link",
"tertiary_link",
"trunk_link",
]
# Download/model a street network for the specified country, including only specified road types
features = ox.graph_from_place(
country,
network_type="drive",
custom_filter='["highway"~"{}"]'.format("|".join(road_types)),
retain_all=True,
simplify=True,
truncate_by_edge=True,
)
# explore nodes and edges together in a single map
gdf_nodes, gdf_edges = ox.graph_to_gdfs(features)
# Save graph as a GeoPackage
output_directory = os.path.join(
Miraca_Exposure_Database_path, network, database, country_abbr
)
os.makedirs(output_directory, exist_ok=True)
output_filename = os.path.join(
output_directory, f"{network_code}_{country_abbr}.gpkg"
)
ox.save_graph_geopackage(features, output_filename)
# Read the edges using GeoPandas
gdf_L = gpd.read_file(output_filename, layer="edges")
# Add columns for country name and abbreviation
gdf_L["Country"] = country
gdf_L["Country_code"] = country_abbr
gdf_L["Geometry"] = "Linestring"
# Define the path for the output GeoParquet file
geoparquet_path = os.path.join(
output_directory, f"{network_code}L_{country_abbr}.parquet"
)
# Write the GeoDataFrame to GeoParquet format
gdf_L.to_parquet(geoparquet_path, index=None)
# Read the nodes using GeoPandas
gdf_P = gpd.read_file(output_filename, layer="nodes")
# Add columns for country name and abbreviation
gdf_P["Country"] = country
gdf_P["Country_code"] = country_abbr
gdf_P["Geometry"] = "Point"
# Define the path for the output GeoParquet file
geoparquet_path = os.path.join(
output_directory, f"{network_code}P_{country_abbr}.parquet"
)
# Write the GeoDataFrame to GeoParquet format
gdf_P.to_parquet(geoparquet_path, index=None)
# Automatically delete the GeoPackage file after saving Parquet files
if os.path.exists(output_filename):
os.remove(output_filename)
else:
print("GeoPackage file not found, nothing to delete.")
# Create base map with lines (edges)
m = gdf_L.explore(
color="black",
tiles="OpenStreetMap",
style_kwds={
"weight": 3, # Line thickness
"opacity": 1 # Full opacity (less transparent)
}
)
# Add points (nodes) to the map
gdf_P.explore(
m=m,
color="#ED5861",
marker_kwds={
"radius": 3, # Point size
"fillOpacity": 1, # Fill transparency
"opacity": 1 # Border transparency
}
)
# Display the map
display(m)
# List of networks
network_list = [
"Education",
"Healthcare",
"Airports",
"Ports",
"Railway",
"Telecommunication",
"Electricity",
]
Specify the network and country for which you would like to run the code
network_type = "Education"
country = "Cyprus"
print(f"Running for {network_type} in {country}")
gdf = select_data_osmnx(country, network_type, "OSM")
Running for Education in Cyprus
street_network = get_street_network(EU_code[country], "Roadway", "OSM")
C:\Users\Paraskevi Tsoumani\anaconda3\envs\miraca\Lib\site-packages\osmnx\_overpass.py:254: UserWarning: This area is 11 times your configured Overpass max query area size. It will automatically be divided up into multiple sub-queries accordingly. This may take a long time.
multi_poly_proj = utils_geo._consolidate_subdivide_geometry(poly_proj)